{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 神经网络的学习\n",
    "这里所说的神经网络的\"学习\"是指**从训练数据中自动获取最优权重参数**的过程。<br>\n",
    "为了使神经网络能进行学习，将导入**损失函数**这一指标。<br>\n",
    "学习的目的就是以该损失函数为基准，找出能使它的值达到最小的权重参数。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 从数据中学习\n",
    "神经网络的特诊就是可以从数据中学习。所谓“从数据中学习”，是指可以**由数据自动决定权重参数的值**。\n",
    "#### 数据驱动\n",
    "数据是机器学习的命根子。\n",
    "#### 训练数据和测试数据\n",
    "机器学习中，一般将数据分为**训练数据(监督数据)**和**测试数据**两部分来进行学习和实验等。<br>\n",
    "首先，使用训练数据进行学习，寻找最优的参数；然后，使用测试数据评价训练得到的模型的实际能力。<br>\n",
    "为了正确评价模型的**泛化能力**，就必须划分训练数据和测试数据。泛化能力是指处理未被观察过的数据(不包含在训练数据中的数据)的能力。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 损失函数\n",
    "神经网络以某个指标为线索寻找最优权重参数。神经网络的学习中所用的指标称为**损失函数(loss function)**。损失函数可以使用任意函数，单一般用均方误差和交叉熵误差等。\n",
    "#### 均方误差\n",
    "均方误差(mean squared error)如下式所示。\n",
    "$$E = \\frac{1}{2}\\sum_{k}^{ }(y_k - t_k)^2$$\n",
    "这里，$y_k$表示神经网络的输出，$t_k$表示监督数据，k表示数据的维数。\n",
    "\n",
    "\n",
    "\n",
    "在手写数字识别的例子中，$y_k$、$t_k$是由如下10个元素构成的数据。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "y = [0.1, 0.05, 0.6, 0.0, 0.05, 0.1, 0.0, 0.1, 0.0, 0.0]\n",
    "t = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0] # “2”为正确解"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里，神经网络的输出y是softmax函数的输出。由于softmax函数的输出可以理解为概率，因此上例表示“0”的概率为0.1，“1”的概率是0.05，“2”的概率是0.6等。t是监督数据，将正确解标签设为1，其他设为0。这里，标签“2”位1，表示正确解是“2”。将正确解标签表示1，其他标签表示为0的表示方法称为**one-hot表示**。\n",
    "\n",
    "\n",
    "我们用Python来实现均方误差。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def mean_squared_error(y, t): # 均方误差\n",
    "    return 0.5 * np.sum((y-t)**2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们使用这个函数，来实际地计算一下。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 设\"2\"为正确解\n",
    "t = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.09750000000000003"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 例1：\"2\"的概率最高的情况(0.6)\n",
    "y = [0.1, 0.05, 0.6, 0.0, 0.05, 0.1, 0.0, 0.1, 0.0, 0.0]\n",
    "mean_squared_error(np.array(y), np.array(t))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.5975"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 例2：\"7\"的概率最高的情况(0.6)\n",
    "y = [0.1, 0.05, 0.1, 0.0, 0.05, 0.1, 0.0, 0.6, 0.0, 0.0]\n",
    "mean_squared_error(np.array(y), np.array(t))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "第一个例子中，正确解为“2”，神经网络输出的最大值（概率最大）是“2”；<br>\n",
    "第二个例子中，正确解为“2”，神经网络输出的最大值（概率最大）是“7”。<br>\n",
    "如实验结果所示，第一个例子的损失函数的值更小，和监督数据之间的误差较小。也就是说，均方误差显示第一个例子的输出结果与监督数据更加吻合。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 交叉熵误差\n",
    "除了均方误差之外，**交叉熵误差(corss entropy error)**也经常被用作损失函数。交叉熵误差如下式所示。\n",
    "$$E = -\\sum_{k}^{ }t_klogy_k$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里，log表示以e为底数的自然对数($log_e$)。$y_k$是神经网络的输出，$t_k$正确解标签。并且$t_k$中只有正确解标签的索引为1，其他均为0(one-hot表示)。因此，上式实际上只计算对应正确解标签的输出的自然对数。<br>\n",
    "比如，假设正确解标签的索引是“2”，与之对应的神经网络的输出是0.6，则交叉熵误差是$-log0.6=0.51$。<br>\n",
    "也就是说，**交叉熵误差的值是由正确解标签所对应的输出结果决定的**。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAD8CAYAAAB5Pm/hAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8li6FKAAAd5klEQVR4nO3dd3hcV73u8e9S75LVrG5b7t1OZDukOIWQhFRyArlJOAmEgCk3HB7OoedQDjycE8g9l/pwSKiXEAglgAOEdEi3Yztxk2NLtizZGllW731m3T9GsU2wozJl79l6P88zj2c8e/b+zbL8ennttdc21lpERMS74pwuQEREIktBLyLicQp6ERGPU9CLiHicgl5ExOMU9CIiHheWoDfGXGGMOWCMOWiM+Uw49ikiIuFhQp1Hb4yJB2qAtwGNwDbgZmvtvtDLExGRUIWjR78eOGitrbPWjgAPAteFYb8iIhIGCWHYRylw9JTXjcCGN25kjNkEbAJIT08/e8mSJWE4tIhIbAtYy8hYIPjwB/7h+eysFAoykwHYsWNHm7W2YKrHCEfQm9P83j+MB1lr7wPuA6iqqrLbt28Pw6FFRNzNWktL7zAN7QMc6RjgSHs/DR2vPx+go3/k77bPS0lgTl4a5bPSqMhN45IlhWyozAPAGNMwnRrCEfSNQPkpr8uApjDsV0QkJoz5AxzrHqK+vZ/69mCYB38doKGjn6HRwIlt4wyU5KQyJy+Ny5bPpjw3GOhzctOpyE0jOy0x7PWFI+i3AQuNMfMAH3ATcEsY9isi4hpj/gC+rkEOt/XT0D4w/mvw+dHOAUb9JwcykhPimJOXRkVuOucvzB9/nsacvHRKc1JJSojuzPaQg95aO2aMuRN4DIgHfmytrQ65MhGRKPMHLE1dg9S393O4Lfiobwv2zo92DDAWOBnm6UnxzMlLZ0lxJlesKDoR7PPy0ynMTCYu7nSj2s4IR48ea+0jwCPh2JeISCRZa2nvHwkGeWs/dW39HG7rC4Z6+wAjYyeHWdLGw3xpcSZvX1HE3PxgkM/JS6MgIxlj3BPmbyYsQS8i4jZDo37q2/upa+2nrrWPutZ+DrUFn/cOjZ3YLjHeMCcvGOAXLS5k3niYv94zj5UwfzMKehGJWdZa2vpGONTaF3y09J947usa5NTrQYuzU6gsSOe6NSVU5mdQWRAM89KcVBLivb0ajIJeRFzPH7A0dg5wsKXvxONQa/DXnlN656mJ8czLT2dNeQ43nFXG/MIMKvPTqSxIJy1p5sbdzP3mIuI6o/4ADe391B7vo7Yl+DjY0kddax/Dp4yd52ckM78gnWtWlzC/IIMFhcEeekl2qqtOgrqFgl5Eom7UH6C+rZ+a433UtvSOB3svh9v6/26aYtmsVBYUZnD+gjwWFAYDfUFBZkTmmnuZgl5EIsYfsBztGODA8V5qmns5cDwY6nVtfScC3RioyE1jYWEmly6dzYLCDBbNzpzxwy3hpFYUkZBZa2ntHWZ/cy8HmnvZ39xLzfFealt6/+6q0PLcVBYVZnLJ0kIWzc5gYWEmCwozSEmMd7B671PQi8iUDI74qTney/7mHvY397L/WPB558DoiW0KM5NZXJTJuzfMYXFRJotnBwM9PVmR4wS1uoiclrWWY91DvHasZ/zRy2vNPdS39fP6BaKpifEsKsrk8uVFLC7KZElRFouLMslNT3K2ePk7CnoRYdQfoPZ4H/uO9bCvKRjs+4710D14spdekZvG0uJMrllVwtLiYKhX5KZplksMUNCLzDADI2O8dqyH6qYe9vq62Xesh5rmPkb8wbH0lMQ4FhdlceXKIpYVZ7G0ONhLz0zRTJdYpaAX8bCeoVH2+rqp9vWwt6mbvb5u6tr6T1wxmpuexPKSLG4/fy7LirNYXpLFvPwM4tVL9xQFvYhHdA+OUu3rZo+vm92+YKg3tA+ceL8kO4VlJdlcvaqElaXZLC/NoigrxRNrucibU9CLxKCBkTH2+nrY3djF7sZguB9u6z/xftmsVFaUZHNjVTkrSrNZXpJFfkaygxWLkxT0Ii436g9woLmXXY1d7Draxa6j3dS29J6Y+VKSncLKsmzeeXYZK0uzWVmazSzNepFTKOhFXMRai69rkFePdLHzaPCx19d9Yp2X3PQkVpVlc/mKIlaXZbOqLOfEjaNFzkRBL+KggZExdjd288qRTl490sWrR7po6xsGgrejW1Gazbs3zGFNRQ5ry3Mom5WqMXWZMgW9SJRYa2nsHGRHQyevHOlkR0Mn+5t78Y+PwVTmp7NxUT5ry3NYUz6LJcWZJHp8nXSJDgW9SISM+gNUN/Wwvb6D7fWd7DjSSWtvsLeenhTP6vIcPnzhfM6ak8Pa8lkaV5eIUdCLhEnf8BivNHSyvb6DbfWdvHq088SCXuW5qZw3P4+z58zirDmzWDw70/N3NRL3UNCLTFNH/wgvH+7g5cMdbKvvoLqpm4CF+DjDsuIsbl5fQdWcXKrmzmJ2VorT5coMpqAXmaTW3mG2Hm5na10HWw+3U3O8DwieNF1bkcOdFy9g3bxczqqYpVUaxVX00yhyBm19w2ypa2dLXTsvHWrnUGvwgqT0pHjOnpvLdWtKOacylxWl2SQnaD11cS8Fvci47sHRE6H+0qF2DhzvBSAjOYF1c2dxY1U5GyrzWFGSpfF1iSkKepmxhkb97Gjo5PmDbbx4sI09vuAYe0piHOvm5nLd2hLeUpnHytJsBbvENAW9zBiBgGV/cy/P1bby/ME2Xj7cwfBYgIQ4w5ryHO68ZCHnzc9jTUWOhmLEUxT04mmtvcM8V9vKszXBcG/rGwFgYWEGt2yo4IKF+ayfl0eGTp6Kh+mnWzxlzB/glSNd/O1AC8/UtFLd1ANAXnoS5y/M54KFBZy/IJ+ibE13lJlDQS8xr7V3mGdqWvnr/haerW2ld2iM+DjD2RWz+OTli7lwUQHLirN0yzuZsRT0EnOstVQ39fDUay08vf84uxq7ASjMTObtK4q4aHEh5y3IJztVt74TAQW9xIihUT8vHmrjiX3BcD/eM4wxsKY8h09ctoiLFheyvCRLKzuKnIaCXlyrs3+Ep/e38Pi+Zp6taWNw1E96UjwbFxXw1qWzuXhxAXm6a5LIhBT04irHugd5bG8zj1Uf5+X6DvwBS1FWCjecXcrblhVxTmWupj6KTJGCXhzX0N7PX/Y285e9zew62gUEpz9++ML5XLZ8NitLszUkIxICBb044nBbP4/sOcYje46dmAK5qiybT12xmMuXFzG/IMPhCkW8I6SgN8a8C/gSsBRYb63dHo6ixJuOdgzwx91N/Hn3yXBfW5HDv1+1lMuXF1Gem+ZwhSLeFGqPfi/wT8C9YahFPKild4g/7TrGH3c38eqR4LDMmvJguF+5spiSnFSHKxTxvpCC3lr7GqDxU/k7vUOjPLq3mc07m3jxUBsBC8uKs/j0FUu4elWxeu4iURa1MXpjzCZgE0BFRUW0DitRMuYP8FxtG7971cfj1c0MjwWoyE3jf1+8gOvWlLCgMNPpEkVmrAmD3hjzJFB0mrfustZunuyBrLX3AfcBVFVV2UlXKK62v7mH325v5A87fbT1jZCTlsiNVeW8Y20pZ1Xk6H97Ii4wYdBbay+NRiESO7oHRvnDTh+/2XGUvb4eEuMNlywp5IazyrhocSFJCVq7XcRNNL1SJiUQsLxU186vth3l0epmRsYCLC/J4kvXLOPaNaXkpic5XaKInEGo0yuvB74DFAB/NsbstNZeHpbKxBVae4f57Y5GHtx2hIb2AbJSErh5XTk3ritneUm20+WJyCSEOuvm98Dvw1SLuIS1lq2HO7h/SwOPVzcz6resn5fLxy9dxBUrikhJ1BIEIrFEQzdyQt/wGA/taOTnWxqobekjOzWRW8+Zyy0bKlhQqCtVRWKVgl441NrHz16s56FXfPQNj7G6LJt73rmKa1aXqPcu4gEK+hnKWstztW38+IXD/O1AK0nxcVy9qpjbzp3LmvIcp8sTkTBS0M8ww2N+Nr/axA+fr6PmeB/5Gcn869sWcfP6Cgoytba7iBcp6GeIroERHth6hJ+8UE9b3zBLi7P473et5urVxVrfXcTjFPQe19w9xI+er+MXW4/QP+Jn46ICPrixknPn5+mqVZEZQkHvUQ3t/fzP3w7x0CuNBCxcs6qYD144n6XFWU6XJiJRpqD3mIMtfXzvrwfZvKuJ+DjDTesq2LSxUitGisxgCnqPONTax7eerOWPu5tISYjn9nPnsmljJYVZKU6XJiIOU9DHuIb2fr71VC1/eNVHckI8H9w4nw9cMI+8DM2gEZEgBX2MOt4zxLeequXX246SEG94/wWVbNpYSb4CXkTeQEEfY7oHRvneMwf56Qv1BKzllg0V3HnJAgozNUQjIqenoI8Rw2N+7n+pge88fZCeoVGuX1PKx9+2SCdZRWRCCnqXs9byyJ5m7n70NY52DHLBwnw++/alLCvRNEkRmRwFvYvt9XXz5T/t4+XDHSwpyuRn71vPxkUFTpclIjFGQe9CHf0j3PPYfh7cdpRZaUl89foV3LSugvg4XckqIlOnoHcRf8Dy4LYj3PPYAXqHxnjfefP4l7cuJDs10enSRCSGKehdYq+vm8/9fg+7G7s5pzKXL1+3gkWzM50uS0Q8QEHvsP7hMb7xRA0/fuEwuenJfOumNVy7ukQLjolI2CjoHfRMTSuf+90efF2D3Ly+gs9csYTsNA3TiEh4Kegd0D04ylf/vI9fb29kfkE6v/nQW1g3N9fpskTEoxT0UfbXAy189qE9tPQO8eGL5vOxty7UfVlFJKIU9FEyMDLGV//8Gg9sPcKi2Rnce+t5rNa9WUUkChT0UbDzaBcf/9VO6tv72bSxkn+7bJFu3yciUaOgj6BAwPI/zxzi/z5Rw+zMZH7x/nN4y/w8p8sSkRlGQR8hrb3D/Ouvd/JcbRtXrSrmP69fqQufRMQRCvoIePFgG//y4E56h0b5z+tXcvP6cs2LFxHHKOjDyFrLvc/W8fVH9zMvP52fv389S4q0yqSIOEtBHya9Q6N88je7ebS6matWFvP1d64iPVnNKyLOUxKFweG2ft7//7ZR3z7Av1+1lDvOn6ehGhFxDQV9iF482MaHH3iFOAM/v2ODZtWIiOso6EPwwNYGvri5mnn56fzoPeuoyNNt/UTEfRT00xAIWL726H7ufbaOixYX8J2b15KZoqmTIuJOCvopGh7z88nf7ObhXU3ces4cvnTtct35SURcLaSgN8bcA1wDjACHgNuttV3hKMyNeoZG+dD9O3jxUDufumIxH75wvk66iojrxYX4+SeAFdbaVUAN8NnQS3Knjv4RbvnBFl4+3MF/v2s1H7logUJeRGJCSEFvrX3cWjs2/nILUBZ6Se7T0jPETfe9RO3xPn5wWxU3nO3JrykiHhVqj/5U7wP+cqY3jTGbjDHbjTHbW1tbw3jYyPJ1DXLjvS/R2DnIT25fx8VLCp0uSURkSiYcozfGPAkUneatu6y1m8e3uQsYAx44036stfcB9wFUVVXZaVUbZY2dA/yve7fQMzTK/Xds4Ow5s5wuSURkyiYMemvtpW/2vjHmPcDVwFuttTER4JPR3D3ELT/YSu/QKL94/zmsLMt2uiQRkWkJddbNFcCngQuttQPhKcl5Lb1D3PKDLXT0j3D/HesV8iIS00Ido/8ukAk8YYzZaYz5fhhqclRn/wj//MOtHOse4ie3r2NthYZrRCS2hdSjt9YuCFchbjA44ud944uT/fS961g3N9fpkkREQhbOWTcxbcwf4KO/fJWdR7v49k1rOHdBvtMliYiEhYKe4A1DPr+5midfO85/XLucK1YUO12SiEjYKOiB7/3tEL98+QgfuWg+t71lrtPliIiE1YwP+serm7nnsQO8Y00Jn7x8sdPliIiE3YwO+gPNvXz8VztZXZbN3Tes0to1IuJJMzboO/tH+MDPtpOWnMC9t1aRkhjvdEkiIhExI4PeH7B89Jev0tw9xPf/+WyKslOcLklEJGJm5I1HvvN0Lc8fbONrN6zU+jUi4nkzrke/pa6dbz9Vy/VrS7mxqtzpckREIm5GBX173zAfe/BV5uSl85V3rNDJVxGZEWZM0AcClk/8Zhed/aN895a1ZCTPyFErEZmBZkzQ37+lgb8eaOWuq5ayvESrUYrIzDEjgv5I+wBfe3Q/GxcVcNtb5jhdjohIVHk+6AMBy6cf2k2cMdz9Tys1Li8iM47ng/4XLx/hpbp27rpqKSU5qU6XIyISdZ4O+sbOAf7rkdc4f0E+N63TVEoRmZk8HfSf/8NeLPBfGrIRkRnMs0H/9P7j/PVAKx+/dBHluWlOlyMi4hhPBv3wmJ8v/3Ef8wvSec+5c50uR0TEUZ4M+p+8UE99+wBfuGY5SQme/IoiIpPmuRRs6RniO0/VcunS2Vy4qMDpckREHOe5oL/70f2M+i2fv3qp06WIiLiCp4J+X1MPv3vFxx0XzGNOXrrT5YiIuIKngv6bT9aQmZLAhy6c73QpIiKu4Zmg39PYzeP7jvOBCyrJTk10uhwREdfwTNB/88kaslMTuf28uU6XIiLiKp4I+p1Hu3hqfwubNlaSmaLevIjIqTwR9N94oobc9CRdHCUichoxH/Q7Gjp5pqaVTRsrddcoEZHTiPmg/9HzdWSnJuqGIiIiZxDTQe/rGuSx6uPctL6ctCT15kVETiemg/7nWxqw1nLrOerNi4icScwG/dCon1++fITLlhVRNkvLEIuInEnMBv3mnT66BkZ5r+bNi4i8qZgMemstP3mhniVFmWyYl+t0OSIirhZS0BtjvmKM2W2M2WmMedwYUxKuwt7M1sMd7G/u5fbz5uoWgSIiEwi1R3+PtXaVtXYN8CfgC2GoaUI/e6menLRErltTGo3DiYjEtJCC3lrbc8rLdMCGVs7EugdHeXJfC+9YU0pKYnykDyciEvNCnnxujPkqcBvQDVz8JtttAjYBVFRUTPt4j+1tZsQf4B1r1ZsXEZmMCXv0xpgnjTF7T/O4DsBae5e1thx4ALjzTPux1t5nra2y1lYVFEz/Fn+bd/mYk5fG6rLsae9DRGQmmbBHb629dJL7+gXwZ+CLIVX0Jlp6hnjxUDsfvWShTsKKiExSqLNuFp7y8lpgf2jlvLmHdzVhLVy7OiqTe0REPCHUMfq7jTGLgQDQAHwo9JLO7OFdTawozWJBYUYkDyMi4ikhBb219oZwFTKRutY+djd2c9eVS6N1SBERT4iZK2Mf3tWEMXCNhm1ERKYkJoLeWsvmnU2cMy+PouwUp8sREYkpMRH0tS19HG7r5+rVxU6XIiISc2Ii6J+taQXgosWFDlciIhJ7YiPoa9uYX5BOaU6q06WIiMQc1wf90KifrXXtXLBw+lfTiojMZK4P+m31HQyPBbhwkYJeRGQ6XB/0z9a0khQfx4ZK3WBERGQ6XB/0z9W2UTV3FmlJIS+0KSIyI7k66I/3DLG/uVfj8yIiIXB10D9X2wbAxkX5DlciIhK7XB30z9a0kp+RzNKiLKdLERGJWa4N+kDA8vzBNi5YmE9cnNaeFxGZLtcGfXVTDx39Ixq2EREJkWuD/qW64Pj8eQsU9CIioXBt0O/x9VCak0phplarFBEJhWuDvtrXzfISnYQVEQmVK4O+d2iUurZ+VpZmO12KiEjMc2XQVzf1ALBCQS8iEjJXBv1eXzegoBcRCQfXBn1RVgoFmclOlyIiEvNcGfR7fN2sKNWJWBGRcHBd0PcPj1HX1q9hGxGRMHFd0O871oO1sKJEQS8iEg6uC/rXT8SuLFPQi4iEg+uCfo+vm4LMZGZn6YpYEZFwcF3Q7/V1s0JXxIqIhI2rgn5wxM/Blj5dESsiEkauCvp9x3oIWFiuoBcRCRtXBX110/iJWAW9iEjYuCro9zR2k5eeRHG2TsSKiISLq4L+wPFelhZnYYxuHSgiEi6uCnpf5yDlualOlyEi4imuCfqhUT/t/SOU5ijoRUTCKSxBb4z5hDHGGmOmfYNXX9cgACUKehGRsAo56I0x5cDbgCOh7KdpPOjVoxcRCa9w9Oi/AXwKsKHsxNepHr2ISCSEFPTGmGsBn7V21yS23WSM2W6M2d7a2voP7zd1DRJnoEhTK0VEwiphog2MMU8CRad56y7gc8BlkzmQtfY+4D6Aqqqqf+j9N3YNMjsrhcR415wfFhHxhAmD3lp76el+3xizEpgH7Bqf914GvGKMWW+tbZ5qIU1dgxqfFxGJgAmD/kystXuAwtdfG2PqgSprbdt09ufrGmRt+azpliMiImfginESf8DS3D1E6Sz16EVEwm3aPfo3stbOne5nW3uHGfVbzbgREYkAV/TofSfm0GvGjYhIuLks6NMcrkRExHtcEfRNJ5Y/UI9eRCTcXBH0vs5BslISyExJdLoUERHPcUXQN3UN6kSsiEiEuCLofV2DlGlqpYhIRLgm6NWjFxGJDMeDvmdolN6hMS1/ICISIY4HfZNuOCIiElGOB/3r69Br+QMRkchwPOh1ZykRkchyPOh9XUMkxhsKMpKdLkVExJNcEPSDFGenEhdnnC5FRMSTHA963XBERCSyHA96X6fm0IuIRJKjQT/qD3C8d0jLE4uIRJCjQd/aO4y1UJStHr2ISKQ4GvSDo34A0pPjnSxDRMTTHA36ofGgT05Q0IuIRIqjQT88FgAgOdHxc8IiIp7lbNCPjgd9goJeRCRSHO7Ra+hGRCTSXDF0k6KhGxGRiNHJWBERj3NFj15j9CIikeOOoNfQjYhIxDg86yY4dJOSqKEbEZFIcUePXkM3IiIR44oefVK8gl5EJFIc79EnJ8RhjG46IiISKY4HvcbnRUQiy/ErYzU+LyISWY6vdaOplSIikeXslbFjfl0VKyISYY736LXOjYhIZIWUssaYLxljfMaYneOPK6fy+eCsG/XoRUQiKSEM+/iGtfb/TOeDOhkrIhJ5jk+vVNCLiESWsdZO/8PGfAl4L9ADbAf+zVrbeYZtNwGbxl+uAPZO+8Dekg+0OV2ES6gtTlJbnKS2OGmxtTZzqh+aMOiNMU8CRad56y5gC8E/AAt8BSi21r5vwoMas91aWzXVYr1IbXGS2uIktcVJaouTptsWE47RW2svnWQBPwD+NNUCREQkskKddVN8ysvr0XCMiIjrhDrr5uvGmDUEh27qgQ9O8nP3hXhcL1FbnKS2OEltcZLa4qRptUVIJ2NFRMT9NLdRRMTjFPQiIh4X0aA3xlxhjDlgjDlojPnMad43xphvj7+/2xhzViTrcdIk2uLd422w2xjzojFmtRN1RtpE7XDKduuMMX5jzDujWV80TaYtjDEXjS8vUm2MeSbaNUbLJP5+ZBtj/miM2TXeFrc7UWc0GGN+bIxpMcacdnLLtHLTWhuRBxAPHAIqgSRgF7DsDdtcCfwFMMA5wNZI1ePkY5JtcS4wa/z5273YFpNph1O2exp4BHin03U7+DORA+wDKsZfFzpdt4Nt8Tnga+PPC4AOIMnp2iPUHhuBs4C9Z3h/yrkZyR79euCgtbbOWjsCPAhc94ZtrgN+ZoO2ADlvmLLpFRO2hbX2RXvyquItQFmUa4yGyfxMAHwUeAhoiWZxUTaZtrgF+J219giAtdar7TGZtrBApgnedzSDYNCPRbfM6LDWPkvw+53JlHMzkkFfChw95XXj+O9NdRsvmOr3vIPgv9heM2E7GGNKCV6T8f0o1uWEyfxMLAJmGWP+ZozZYYy5LWrVRddk2uK7wFKgCdgDfMxaG4hOea4z5dwMx+qVZ3K6O36/cS7nZLbxgkl/T2PMxQSD/vyIVuSMybTDN4FPW2v9Hr9p/GTaIgE4G3grkAq8ZIzZYq2tiXRxUTaZtrgc2AlcAswHnjDGPGet7Yl0cS405dyMZNA3AuWnvC4j+K/xVLfxgkl9T2PMKuCHwNutte1Rqi2aJtMOVcCD4yGfD1xpjBmz1v4hOiVGzWT/frRZa/uBfmPMs8BqwGtBP5m2uB242wYHqQ8aYw4DS4CXo1Oiq0w5NyM5dLMNWGiMmWeMSQJuAh5+wzYPA7eNn0U+B+i21h6LYE1OmbAtjDEVwO+AWz3YY3vdhO1grZ1nrZ1rrZ0L/Bb4iAdDHib392MzcIExJsEYkwZsAF6Lcp3RMJm2OELwfzYYY2YDi4G6qFbpHlPOzYj16K21Y8aYO4HHCJ5V/7G1ttoY86Hx979PcFbFlcBBYIDgv9qeM8m2+AKQB3xvvDc7Zj22Yt8k22FGmExbWGtfM8Y8CuwGAsAPrbWeW09qkj8XXwF+aozZQ3Do4tPWWk8uXWyM+SVwEZBvjGkEvggkwvRzU0sgiIh4nK6MFRHxOAW9iIjHKehFRDxOQS8i4nEKehERj1PQi4h4nIJeRMTj/j8h5QKBWoRqNQAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# 自然对数 y = logx 的图像\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "\n",
    "x = np.arange(0.001, 1.0, 0.01)\n",
    "y = np.log(x)\n",
    "\n",
    "plt.plot(x, y)\n",
    "plt.ylim(-5, 0)# y范围(-5,.)\n",
    "plt.xlim(0, 1) # x范围(0,1)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如图所示，x=1时，y=0；随着x向0靠近，y逐渐变小。因此，正确解标签对应的输出越大，交叉熵误差越接近0.\n",
    "\n",
    "\n",
    "下面，我们用代码来实现交叉熵误差。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [],
   "source": [
    "def cross_entropy_error(y, t):\n",
    "    delta = 1e-7 # 1e-7 为 1 * 10^-7，即0.0000001\n",
    "    return -np.sum(t * np.log(y + delta))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里，参数y和t是NumPy数组。函数内部在计算np.log时，加上了一个微小值delta。这是因为，当出现np.log(0)时，np.log(0)会变成负无限大的-inf，这样一来就会导致后续计算无法进行。作为保护性对策，添加一个微小值可以防止负无限大的发生。<br>\n",
    "\n",
    "下面，我们使用cross_entropy_error(y, t)来进行一些简单的计算。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.510825457099338"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "t = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0] # \"2\"为正确解\n",
    "y = [0.1, 0.05, 0.6, 0.0, 0.05, 0.1, 0.0, 0.1, 0.0, 0.0] # 正确解标签对应的输出是0.6\n",
    "cross_entropy_error(np.array(y), np.array(t))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2.302584092994546"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y = [0.1, 0.05, 0.1, 0.0, 0.05, 0.1, 0.0, 0.6, 0.0, 0.0] # 正确解标签对应的输出是0.1\n",
    "cross_entropy_error(np.array(y), np.array(t))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在第一个例子中，正确解标签对应的输出（概率）是0.6，此时交叉熵误差大约为0.51。<br>\n",
    "在第二个例子中，正确解标签对应的输出（概率）为0.1，此时交叉熵误差大约为2.3。<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### mini-batch学习\n",
    "机器学习使用训练数据进行学习。严格的说，就是针对训练数据计算损失函数的值，找出使该值尽可能小的参数。<br>\n",
    "因此，计算损失函数时必须将所有的训练数据作为对象。也就是说，如果训练数据由100个的话，我们就要把这100个损失函数的总和作为学习的指标。\n",
    "前面介绍的损失函数的例子中考虑的都是针对单个数据的损失函数。如果要求所有训练数据的损失函数的总和，以交叉熵误差为例，可以写成下面的式子。<br>\n",
    "$$\n",
    "E = -\\frac{1}{N} \\sum_{n}^{ } \\sum_ {k}^{ } t_{nk} log_{nk}\n",
    "$$\n",
    "这里，假设数据有 N 个，$t_{nk}$ 表示第 n 个数据的第 k 个元素的值($y_{nk}$是神经网络的输出，$t_{nk}$是监督数据)，最后除以N进行正规化，可以求单个数据的“平均损失函数”。\n",
    "\n",
    "\n",
    "如果遇到大数据，数据量会有几百万、几千万之多，这种情况下以全部数据为对象计算损失函数是不现实的。因此，我们从全部数据中选出一部分，作为全部数据的“近似”。神经网络的学习也是从训练数据中选出一批数据(称为mini-batch，小批量)，然后对每个mini-batch进行学习。比如，从60000个训练数据中随机选择100笔，再用着100笔数据进行学习。这种方式称为**mini-batch学习**。\n",
    "\n",
    "\n",
    "下面来编写从训练数据中随机选择指定个数的数据的代码，以进行mini-batch学习。在这之间，先来看一下用于读入MNIST数据集的代码。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(60000, 784)\n",
      "(60000, 10)\n"
     ]
    }
   ],
   "source": [
    "import sys, os\n",
    "sys.path.append(os.pardir)\n",
    "import numpy as np\n",
    "from dataset.mnist import load_mnist\n",
    "\n",
    "(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True, one_hot_label=True)\n",
    "\n",
    "print(x_train.shape) # (60000,784)\n",
    "print(t_train.shape) # (60000,10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "之前介绍过，load_mnist 函数是用于读入 MNIST 数据集的函数，它会读入训练数据和测试数据。读入数据时，通过设定 one_host_label=True，可以得到 one-hot 表示(即仅正确解标签为1，其余为0的数据结构)。<br>\n",
    "读入上面的 MNIST 数据后，训练数据有60000个，输入数据是784维(28x28)的图像数据，监督数据是10维的数据。因此上面的 x_train、t_train 的形状分别是(60000, 784)和(60000, 10)<br>\n",
    "\n",
    "\n",
    "那么如何从这个训练数据中随机抽取10笔数据呢？可以使用 NumPy 的 np.random.choice()。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([49155, 42364, 30504,  7265,  3586, 26023, 50966, 58737, 25345,\n",
       "       41773])"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 使用 np.random.choice() 可以从指定的数字中随机选择想要的数字。\n",
    "np.random.choice(60000, 10) # 从0到59999之间随机选择10个数字"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们只需指定这些随机选出的索引，取出 mini-batch，然后使用这个 mini-batch 计算损失函数即可。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_size = x_train.shape[0] # 训练数据的总数，60000\n",
    "batch_size = 10 # 批数量，这里取10个\n",
    "batch_mask = np.random.choice(train_size, batch_size) # 从训练数据的总数中，随机取出 batch_size（这里为10） 个数据的索引\n",
    "x_batch = x_train[batch_mask] # 取出的10个训练数据\n",
    "t_batch = t_train[batch_mask] # 训练数据对应的测试数据"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "mini-batch 的损失函数是利用一部分样本数据来近似地计算整体。也就是说，用随机选择的小批量数据(mini-batch)作为全体训练数据的近似值。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### mini-batch 版交叉熵误差的实现\n",
    "这里，我们实现一个可以同时处理单个数据和批量数据(数据作为batch集中输入)两种情况的函数。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [],
   "source": [
    "def cross_entropy_error(y, t): # y是神经网络的输出，t是监督数据\n",
    "    if y.ndim == 1:  # y的维度为1时，即求单个数据的交叉熵误差时，需要改变数据的形状\n",
    "        t = t.reshape(1, t.size)\n",
    "        y = y.reshape(1, y.size)\n",
    "    \n",
    "    batch_size = y.shape[0] # 批数量\n",
    "    return -np.sum(t * np.log(y + 1e-7)) / batch_size"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里，y是神经网络的输出，t是监督数据。y的维度为1时，即求单个数据的交叉熵误差时，需要改变数据的形状。并且，当输入为mini-batch时，要用batch的个数进行正规化，计算单个数据的平均交叉熵误差。\n",
    "\n",
    "\n",
    "此外，当监督数据是标签形式(非one-hot表示，而是像 “2”、“7” 这样的标签)时，交叉熵误差可通过如下代码实现。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [],
   "source": [
    "def cross_entropy_error(y, t):\n",
    "    if y.ndim == 1:\n",
    "        t = t.reshape(1, t.size)\n",
    "        y = y.reshape(1, y.size)\n",
    "    \n",
    "    batch_zie = y.shape[0]\n",
    "    return -np.sum(np.log(y[np.arange(batch_size),t] + 1e-7)) / batch_size # 这一行要仔细理解"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "由于 one-hot 表示中 t 为0的元素的交叉熵误差也为0，因此针对这些元素的计算可以忽略。换言之，如果可以获得神经网络在正确解标签处的输出，就可以计算交叉熵误差。因此，t 为 one-hot 表示时通过 t * np.log(y) 计算的地方，再 t 为标签形式时，可用 np.log( y[np.arange(batch_zie), t] ) 实现相同的处理（为了便于观察，这里省略了微小值 1e-7）。<br>\n",
    "\n",
    "简单介绍一下 np.log( y[np.arange(batch_size), t] )。np.arange(batch_size) 会生成一个从0到 batch_size-1 的数组。比如当 batch_size 为5时，np.arange(batch_size)会生成一个 Numpy 数组[0, 1, 2, 3, 4]。因为 t 中标签是以[2, 7, 0, 9, 4]的形式存储的，所以 y[np.arange(batch_size), t] 能抽取出各个数据的正确解标签对应的神经网络的输出。在这个例子中，y[np.arange(batch_size), t] 会生成 NumPy 数组 [ y[0,2], y[1,7], y[2,0], y[3,9], y[4,4] ]。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {},
   "outputs": [
    {
     "ename": "SyntaxError",
     "evalue": "invalid syntax (<ipython-input-61-67ec01f055e9>, line 5)",
     "output_type": "error",
     "traceback": [
      "\u001b[1;36m  File \u001b[1;32m\"<ipython-input-61-67ec01f055e9>\"\u001b[1;36m, line \u001b[1;32m5\u001b[0m\n\u001b[1;33m    [4,4,...]])\u001b[0m\n\u001b[1;37m              ^\u001b[0m\n\u001b[1;31mSyntaxError\u001b[0m\u001b[1;31m:\u001b[0m invalid syntax\n"
     ]
    }
   ],
   "source": [
    "y = np.array([[0,2,...,\n",
    "         [1,7,...],\n",
    "         [2,0,...],\n",
    "         [3,9,...],\n",
    "         [4,4,...]])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 为什么要设定损失函数\n",
    "在进行神经网络的学习时，不能将识别精度作为指标。因为如果以识别精度为指标，则参数的导数在绝大多数地方都会变成。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
